Positive Selection Drives cis-regulatory Evolution Across the Threespine Stickleback Y Chromosome

Abstract Allele-specific gene expression evolves rapidly on heteromorphic sex chromosomes. Over time, the accumulation of mutations on the Y chromosome leads to widespread loss of gametolog expression, relative to the X chromosome. It remains unclear if expression evolution on degrading Y chromosomes is primarily driven by mutations that accumulate through processes of selective interference, or if positive selection can also favor the down-regulation of coding regions on the Y chromosome that contain deleterious mutations. Identifying the relative rates of cis-regulatory sequence evolution across Y chromosomes has been challenging due to the limited number of reference assemblies. The threespine stickleback (Gasterosteus aculeatus) Y chromosome is an excellent model to identify how regulatory mutations accumulate on Y chromosomes due to its intermediate state of divergence from the X chromosome. A large number of Y-linked gametologs still exist across 3 differently aged evolutionary strata to test these hypotheses. We found that putative enhancer regions on the Y chromosome exhibited elevated substitution rates and decreased polymorphism when compared to nonfunctional sites, like intergenic regions and synonymous sites. This suggests that many cis-regulatory regions are under positive selection on the Y chromosome. This divergence was correlated with X-biased gametolog expression, indicating the loss of expression from the Y chromosome may be favored by selection. Our findings provide evidence that Y-linked cis-regulatory regions exhibit signs of positive selection quickly after the suppression of recombination and allow comparisons with recent theoretical models that suggest the rapid divergence of regulatory regions may be favored to mask deleterious mutations on the Y chromosome.


Introduction
The evolution of heteromorphic sex chromosomes has occurred many times among species (Bachtrog 2013;Bachtrog et al. 2014).Heteromorphic sex chromosomes evolve once recombination is suppressed between the X and Y (or Z and W) (Muller 1918;Charlesworth 1978;Charlesworth and Charlesworth 2000).After recombination is suppressed, the Y chromosome rapidly accumulates mutations, leading to sequence degeneration (Charlesworth and Charlesworth 2000).Empirical evidence of this process has focused on the sequence evolution of coding regions on sex chromosomes.Broad comparative work has revealed that coding sequence evolution can follow one of several different evolutionary trajectories.Although many ancestral Y-linked genes are lost because of sequence degeneration through the accumulation of deleterious mutations, growing evidence indicates not all genes are lost.Some genes appear to be dosage sensitive and are under strong purifying selection to be retained on the Y chromosome (Bellott et al. 2014(Bellott et al. , 2017;;White et al. 2015;Peichel et al. 2020).Novel genes with sex-specific functions can also accumulate on sex-limited chromosomes via translocation and subsequent gene duplication (Soh et al. 2014;Mahajan and Bachtrog 2017;Ellison and Bachtrog 2019;Hughes et al. 2020;Peichel et al. 2020;Chang et al. 2022).
The accumulation of mutations is not restricted to coding sequences and can also occur in cis-regulatory regions, leading to changes in expression from the sex-limited chromosome.Gametolog expression (ancestral genes still shared between the X and Y chromosomes) has been shown to evolve rapidly on degenerating sex chromosomes, often resulting in the loss of expression from the sex-limited chromosome (Y and W) (Meisel et al. 2012a;Muyle et al. 2012Muyle et al. , 2018;;Ayers et al. 2013;Singh et al. 2014;White et al. 2015;Beaudry et al. 2017;Rodríguez Lorenzo et al. 2018;Martin et al. 2019;Veltsos et al. 2019;Wei and Bachtrog 2019;Shaw and White 2022).Although deleterious regulatory mutations may accumulate through selective interference (Charlesworth and Charlesworth 2000;Bachtrog 2008), selection may also favor mutations within cis-regulatory regions to downregulate coding regions with deleterious mutations (Orr and Kim 1998;Bachtrog 2006).Recent theory supports the role of positive selection driving the rapid accumulation of mutations to downregulate the Y-linked allele and upregulate the X-linked allele to maintain ancestral dosage balance (Lenormand et al. 2020;Lenormand and Roze 2022).
One challenge to characterizing the molecular evolution of regulatory regions on sex chromosomes is that complete Y chromosome assemblies are only available for a limited number of taxa.Reference assemblies are needed to identify the substitutions that are accumulating within noncoding regions.Additionally, many of the available Y chromosome assemblies are from model organisms that have sex chromosomes that are highly degenerated (Hughes et al. 2010(Hughes et al. , 2020;;Bellott et al. 2014;Soh et al. 2014;Tomaszkiewicz et al. 2016;Mahajan et al. 2018).These species only have a few remaining ancestral gametologs on the Y chromosome, thus limiting the number of genes available to study how cis-regulatory evolution potentially leads to expression differences between the X and Y. Species with recently derived sex chromosomes that still harbor many ancestral Y-linked gametologs at varied stages of degeneration are needed to understand how substitutions within cis-regulatory regions lead to the evolution of expression differences between the X and Y.
In addition to having a chromosome-scale assembly of the Y chromosome, an additional challenge is annotating regulatory regions (reviewed in Shaw and White 2022).Recent approaches have focused on identifying regions of the genome with accessible chromatin that may contain transcription factor binding sites to regulate gene expression (Ricci et al. 2019).Chromatin accessibility profiling techniques, like assay for transposase accessible chromatin sequencing (ATAC-seq) (Buenrostro et al. 2013), utilize short-read sequencing to profile accessible chromatin regions (ACRs) at a fine scale across the genome.Across autosomes, ACRs show signatures of purifying selection relative to other intergenic regions, consistent with these regions containing important functional elements for gene regulation (Connelly et al. 2014;Lu et al. 2019;Horvath et al. 2021).Profiling ACRs on sex chromosomes would therefore provide a means to study the molecular evolution of cisregulatory regions in the context of Y degeneration.
The threespine stickleback fish (Gasterosteus aculeatus) is an excellent model to study the evolution of cis-regulatory regions on sex chromosomes.Threespine stickleback fish have a high-quality reference assembly of the Y chromosome and this chromosome is more recently derived compared to many other species with available chromosome-scale assemblies (Hughes et al. 2010(Hughes et al. , 2012(Hughes et al. , 2020;;Bellott et al. 2014Bellott et al. , 2017;;Peichel et al. 2020).The threespine stickleback sex chromosomes contain multiple evolutionary strata at different stages of degeneration (Roesti et al. 2013;White et al. 2015;Peichel et al. 2020), allowing for comparisons of sequence evolution at different temporal scales.Strata form if recombination is suppressed between sex chromosomes in multiple steps (Bachtrog 2013).Each subsequent phase of recombination suppression is followed by the accumulation of substitutions on the Y chromosome, leading to increased sequence divergence between the X and Y.The age of individual strata can therefore be distinguished by varied levels of synonymous divergence.In threespine stickleback fish, the oldest stratum is estimated to have formed around 22 million years ago.Two younger strata formed between 4 and 6 million years ago (Peichel et al. 2020;Sardell et al. 2021).Stratum 1 has undergone the most extensive degeneration, with only 18% of ancestral coding regions remaining on the Y chromosome (Peichel et al. 2020).Many of the coding regions of these gametologs that remain in stratum 1 on the Y chromosome show signatures of purifying selection and are enriched for dosage-sensitive functions (White et al. 2015;Peichel et al. 2020).Interestingly, chromosome-wide dosage compensation has not evolved to counter the loss of expression for genes where the coding sequence has degenerated on the Y chromosome (White et al. 2015).The lack of chromosome-wide dosage compensation enables a gene-by-gene comparison of cis-regulatory changes to understand how expression of X-linked alleles evolves in response to degenerating coding regions on the Y.
Here, we leveraged ATAC-seq from 2 different tissues to identify cis-regulatory regions on the sex chromosomes and an autosome.We found sex-linked cis-regulatory regions exhibited signatures of positive selection, relative to nonfunctional, intergenic control regions.Our findings complement existing sex chromosome evolution theory that suggests the accumulation of mutations within cisregulatory regions may be beneficial to silence expression from a degenerating Y chromosome.

Nucleotide Substitutions are Accumulating at High Rates Within Accessible Chromatin Regions
The completion of the threespine stickleback Y chromosome assembly (Peichel et al. 2020) allowed us to thoroughly examine regulatory evolution among sex-linked gametologs shared between the sex chromosomes.ACRs putatively contain cis-regulatory elements that serve as domains for transcription factor binding.High sequence divergence within ACRs, therefore, represents regions of the genome that are likely experiencing cis-regulatory evolution.We first sought to compare the divergence of ACRs between the X and Y chromosomes to previously characterized synonymous divergence throughout coding regions.Crossing over was suppressed across the threespine stickleback sex chromosomes in at least 3 separate events, forming strata with distinct levels of divergence (stratum 1: oldest; stratum 2: intermediate; stratum 3: youngest) (Peichel et al. 2020).To survey ACRs, we utilized ATAC-seq from liver tissue (Naftaly et al. 2021) and testis tissue.We defined a set of noncoding ACRs that were likely present in the ancestor of the X and Y chromosomes, by identifying X-linked ACRs within intergenic sequence that still had homologous sequence on the Y chromosome and homologous sequence in the ancestral autosome (homologous sequence in ninespine stickleback chromosome 19).To calculate divergence, we compared these regions with the homologous sequence on the Y chromosome, which may no longer be associated with accessible chromatin.In the liver, we found a total of 1,279 Shaw et al. • https://doi.org/10.1093/molbev/msae020MBE X-linked ACRs.We were able to align 948 (74.1%) of these ACRs to homologous regions on the Y chromosome and the ancestral ninespine stickleback (Pungititus pungitius) autosome 19.869 liver ACRs were within 50 kb of a coding region shared by all three chromosomes.In the testis, we identified 896 X-linked ACRs.Compared to the liver, we found far fewer orthologous regions on the Y chromosome and autosome 19 in the ninespine stickleback (118 total aligned; 13.2%), suggesting cis-regulatory regions within testis ACRs may be more rapidly evolving.All 118 of the aligned testis ACRs were within 50 kb of a coding region shared by all 3 chromosomes.We used these aligned sets of ACRs that were close to conserved coding regions for all subsequent analyses.
We estimated divergence between the X and Y chromosomes among ACRs in each tissue and compared this to synonymous and nonsynonymous substitutions within coding regions across the sex chromosomes.Coding regions across all 3 strata on the threespine stickleback Y chromosome are accumulating nonsynonymous substitutions at a higher rate than autosomes due to reduced efficacy of selection (White et al. 2015;Peichel et al. 2020).We first compared the divergence among testis and liver ACRs to nonsynonymous substitution rates within coding regions to determine if mutations were accumulating at a similar rate.We found liver and testis ACRs have much higher divergence compared to nonsynonymous mutations among all 3 strata (Fig. 1) (P < 0.001; Kruskal-Wallis and Dunn's Test).We also compared ACR divergence with synonymous substitutions, which is used as measure of neutral evolution.Testis ACRs had divergence higher than neutral, synonymous substitutions in all 3 strata (Fig. 1) (P < 0.001; Kruskal-Wallis and Dunn's Test).Liver ACRs exhibited divergence that was higher than synonymous substitutions in the 2 oldest strata (1 and 2) (P < 0.001; Kruskal-Wallis and Dunn's test) (Fig. 1), indicating these regions are also evolving rapidly.In the youngest stratum (3), the liver ACRs were not significantly different than synonymous substitutions (P = 0.413, Kruskal-Wallis and Dunn's test).Additionally, ACR divergence was distinct between all 3 evolutionary strata (P < 0.001; Kruskal-Wallis and Dunn's test) and corresponded with the extent of evolutionary divergence of each stratum.ACR divergence higher than both nonsynonymous and synonymous substitutions strongly suggests cis-regulatory Fig. 1.XY divergence within ACRs and coding regions.ACRs show elevated divergence between the X and Y, relative to nonsynonymous and synonymous substitutions in coding regions.Protein-coding divergence was estimated between the X and Y chromosomes at all shared gametologs.ACRs were identified on the X chromosome and mapped to the Y chromosome.Substitutions within ACRs were summed and divided by the total length of the ACR to calculate divergence.Stratum 1: 12 gametologs, 442 liver ACRs, 30 testis ACRs; Stratum 2: 71 gametologs, 228 liver ACRs, 37 testis ACRs; Stratum three: 46 gametologs, 199 liver ACRs 51 testis ACRs.Rates of sequence divergence for all 4 categories were compared using a Kruskal-Wallis test for each stratum (P < 0.001 for all strata).A post hoc Dunn's test was used to identify which groups were significantly different from one another.P-values were adjusted for multiple comparisons with a Bonferroni correction.The letters above each plot indicate the significant differences between sites within each stratum (P < 0.001).Box plots that share a letter assignment are not significantly different from one another (P > 0.001).
Positive Selection Drives cis-regulatory Evolution • https://doi.org/10.1093/molbev/msae020MBE regions are evolving rapidly across all 3 strata.However, this analysis alone cannot rule out the possibility that purifying selection may still be operating within coding regions on the Y chromosome, reducing nonsynonymous substitutions.Synonymous substitutions may also be under weak purifying selection (i.e.codon bias) (Bachtrog 2005;Lawrie et al. 2013).In this scenario, we may observe higher divergence rates within ACRs relative to coding regions.
We, therefore, sought to compare ACR divergence with nonfunctional regions throughout the sex chromosomes as well as ACR divergence throughout a representative autosome.We used the orthologous sequence from the ninespine stickleback fish to infer substitution rates throughout the X and Y chromosomes separately as well as throughout autosome 18.We defined nonfunctional control regions as intergenic sequence of the chromosomes that did not overlap known gene annotations, repetitive elements, or our previously defined ACRs (see Materials and Methods).Although our ability to define nonfunctional regions is limited to the currently available annotations, this represents our best approximation of regions of the genome that are not under selection.If ACRs are evolving faster than expected under neutrality, we would predict the average ACR sequence divergence would exceed the rate of divergence observed across permutations of nonfunctional control regions (Fig. 2a).However, if ACR divergence is similar to the control regions (Fig. 2a), this would suggest substitutions are accumulating in ACRs at the same rate as the remainder of the Y chromosome, which is subject to sequence degeneration across the noncrossover region (Charlesworth and Charlesworth 2000).
ACRs functional in the liver exhibited higher substitution rates on the Y chromosome than on the X chromosome.Y-linked ACRs had significantly higher substitution rates compared to the nonfunctional control regions for the 2 oldest evolutionary strata (Fig. 2b; P < 0.001; stratum 1 and 2; 10,000 permutations).In contrast to the Y-linked ACRs, we found the X-linked liver ACR sequence divergence was significantly lower than nonfunctional control regions (Fig. 2b; Stratum 1: P = 0.016; Stratum 2 and 3: P < 0.001; 10,000 permutations).This suggests that ACRs may be functionally constrained more on the X chromosome, compared to Y-linked ACRs.We observed a similar pattern on the Y chromosome for ACRs functional in testis tissue.Y-linked ACRs had significantly higher substitution rates in all 3 evolutionary strata, compared to nonfunctional control regions (Fig. 2b; P < 0.001, 10,000 permutations).However, we observed X-linked ACRs from testis tissue, also exhibited elevated nucleotide substitutions in the oldest 2 evolutionary strata (Fig. 2b; P < 0.001; 10,000 permutations), while the youngest stratum was significantly lower than intergenic regions (Fig. 2b; P < 0.001; 10,000 permutations).This indicates some cisregulatory elements exhibited accelerated substitution rates on the X chromosome and this occurs in a tissuespecific fashion.
Autosomal ACRs have been shown to be under purifying selection in other species (Horvath et al. 2021).We therefore tested whether threespine stickleback autosomal ACRs also exhibited purifying selection, similar to the pattern we observed on the X chromosome in liver.Consistent with patterns of purifying selection, we found autosomal ACRs exhibited lower sequence divergence compared to nonfunctional control regions in both liver (Fig. 2b; P = 0.002; 10,000 permutations) and testis tissue (Fig. 2b; P < 0.001; 10,000 permutations).

Accessible Chromatin Regions Exhibit Signatures of Selection
An excess of substitutions on the Y chromosome could be caused by relaxed purifying selection or positive selection.To distinguish between these alternatives, we used shortread sequencing of 12 males from a freshwater population of stickleback fish (Shanfelter et al. 2019) to search for signatures of selection within ACRs.In the absence of selection, the ratio of polymorphism to sequence divergence should be equal between functional and nonfunctional sites.Deviation from this expectation would imply these regions are under selection.To explore this, we performed a modified McDonald-Kreitman (MK) test on noncoding sequences (Andolfatto 2005).We compared the number of X-and Y-specific substitutions to the number of polymorphisms segregating within a population for ACRs and nonfunctional control sequences.Except for stratum 2 on the X chromosome, sex chromosome ACRs from the liver and the testis had a greater proportion of sites fixed by positive selection (α), compared to autosomal ACRs (P < 0.001; chi-square test) (Table 1).We also identified the direction of selection acting on ACRs by calculating the odds-ratio of the MK test, known as the neutrality index (NI) (Rand and Kann 1996).Liver ACRs of the sex chromosomes had an NI < 1 (P < 0.001; chi-square test), consistent with positive selection (Fig. 3; Table 1).The only exception we found was for one of the younger strata on the X chromosome (stratum 2), which had an elevated NI consistent with purifying selection.Despite having a mean divergence lower than intergenic sequence, we found signatures of positive selection on the X chromosome for strata 1 and 3.In these cases, the departure from neutrality was driven by the low number of polymorphisms within ACRs compared to intergenic sequences, rather than high divergence within ACRs (Table 1).In these cases, the reduced polymorphism may be due to recent positive selection (Hughes et al. 2008;Stoletzki and Eyre-Walker 2011).Each stratum on the Y chromosome had a clear signature of positive selection (NI < 1; P < 0.001).Compared to the X chromosome and the autosome, the Y chromosome liver ACRs had stronger signatures of positive selection, indicative of a nonoverlapping bootstrap interval.For the testis ACRs, NI was indistinguishable between the X and Y chromosome, and both sex chromosomes had nonoverlapping bootstrap intervals with the representative autosome, indicating stronger positive selection acting on sex-linked ACRs.Similar to X-linked ACRs in liver, we found that testis ACRs in ACR substitution rates (red) lower than random nonfunctional control regions (white) and synonymous substitutions (blue) would indicate purifying selection (left).ACR substitution rates roughly equal to nonfunctional control regions would indicate mutations are accumulating due to inefficient selection from the loss of recombination (middle).ACR substitution rates higher than nonfunctional control regions would indicate positive selection.b) Y-linked ACR divergence is elevated over nonfunctional intergenic regions.X and Y ACR divergence was compared to randomly drawn nonfunctional regions matched for GC content (10,000 permutations).The average ACR divergence of each stratum is shown by the dotted line (stratum 1: pink, stratum 2: green, and stratum 3: blue).ACRs were identified on the X chromosome and mapped to the Y chromosome and homologous ninespine stickleback autosome 19 from liver (top) and testis (bottom).Autosome divergence is from chromosome 18.For the sex chromosomes, we identified variants that were unique to the X and Y, by aligning to the outgroup and ignoring fixed differences between species that likely occurred before the sex chromosomes evolved.X-and Y-specific variants within ACRs were summed and divided by the total length of the ACR to calculate divergence.Stratum 1: 442 liver ACRs and 30 testis ACRs; stratum 2: 228 liver ACRs and 37 testis ACRs; stratum 3: 129  liver ACRs and 46 testis ACRs; autosomal: 1056 liver ACRs and 305 testis ACRs Positive Selection Drives cis-regulatory Evolution • https://doi.org/10.1093/molbev/msae020MBE stratum 3 on the X chromosome exhibited an NI < 1, due to a low number of polymorphisms in ACR rather than an excess in divergence.
We compared the patterns we observed across ACRs with coding regions.We found that coding regions across the entire Y chromosome had an excess of nonsynonymous polymorphisms (Fig. 3; NI > 1; P < 0.001; chi-square test), consistent with relaxed purifying selection on a degenerating Y chromosome.This pattern was largely driven by the youngest 2 strata of the Y chromosome (Table 2).
The oldest stratum (1) did not deviate from neutral expectations (P = 0.112).Coding regions across the X chromosome were not significantly different than neutral expectations (NI = 1; P = 0.281; chi-square test).When split by evolutionary strata only the youngest stratum (3) deviated from neutral expectations after correcting for multiple comparisons.In stratum 3, the coding regions exhibited purifying selection (NI > 1; Pn < Ps; P < 0.001; chi-square test).Unlike the Y chromosome, the entire X chromosome did not have an excess of nonsynonymous

Y-Linked Substitutions are Correlated With Biased Expression From the X Chromosome
Selection should favor the loss of expression from the Y chromosome in order to silence coding regions that have accumulated deleterious mutations (Orr and Kim 1998;Bachtrog 2006;Lenormand et al. 2020).To test if Y-linked cis-regulatory divergence was associated with changes in expression, we compared ACR sequence divergence to changes in allelic expression on the X and Y from liver and testis RNA-seq transcriptomes (Peichel et al. 2020).To maximize the total number of genes for this analysis, we pooled genes across all 3 evolutionary strata.We found that sequence divergence among liver ACRs that were proximal to genes was a predictor of allele-specific expression.We observed higher expression of the X-linked gametolog, relative to the Y-linked gametolog, when ACRs on the Y chromosome had more substitutions (Fig. 4A; N = 51; P = 0.004; R = 0.4; Spearman's rank correlation; 95% Confidence Interval via bootstrapping (0.1249, 0.5769)), indicating that ACR divergence explains around 16% of the X-biased expression observed in liver tissue We also observed a correlation between expression and substitutions within ACRs on the X chromosome (Fig. 4C; N = 51; P = 0.033; R = 0.30; Spearman's rank correlation; 95% Confidence Interval from bootstrapping (0.0909, 0.4742)).The slopes of each linear regression were indistinguishable between the X and Y chromosome (P = 0.590; Fisher Z transformation).Although we do not have gene expression from the ninespine stickleback to determine which sex chromosome is deviating from the ancestral expression level, the Y-linked substitutions indicate the X-biased expression we observe is likely due to downregulating the Y chromosome.A similar trend was observed in testis tissue for Y-linked ACR divergence (Fig. 4), but the correlations were not significant (Fig. 4a; N = 31; P = 0.26; R = 0.201; Spearman's rank correlation; 95% Confidence Interval via bootstrapping (−0.0095, 0.5706)).There are fewer testis-expressed genes in this analysis.Therefore, the lack of significance could be due to sample size.To identify if sample size Positive Selection Drives cis-regulatory Evolution • https://doi.org/10.1093/molbev/msae020MBE had an effect, we performed a randomized down-sampling for the liver-expressed genes.Even with down-sampling, 68.0% of the permutations still produced significant correlations between ACR divergence and X-biased expression in the liver (P < 0.05; 10,000 permutations).This indicates the lack of correlation observed with the testis-expressed genes is likely not due to a smaller sample size.Despite having greater ACR divergence in testis compared to liver, 22 gametologs exhibited Y-biased expression in testis (Table 3).This suggests Y ACR evolution also contributes to up-regulation of male-specific genes.

Cis-regulatory Evolution is not Correlated With Deleterious Mutations in Coding Regions
The down-regulation of Y-linked gametologs could be adaptive if loss of expression follows the accumulation of deleterious coding substitutions (Orr and Kim 1998).We searched for evidence of adaptive silencing on the Y chromosome by comparing ACR divergence to the ratio of nonsynonymous to synonymous substitutions (d N /d S ) within the coding region of each gene.An adaptive silencing model would be supported if Y-linked gametologs with an elevated d N /d S ratio also have elevated cis-regulatory divergence.Among genes shared between the X and Y chromosomes, we found no correlation between ACR nucleotide divergence and Overall, most genes on the Y seem to be evolving independently from neighboring regulatory regions, except for a small set of genes under positive selection.
We also examined whether ACR divergence was higher within genes that contained frameshift mutations, nonsense mutations, or in-frame deletions.These types of mutations are more likely to produce a nonfunctional or suboptimal peptide.We found no significant difference in the number of ACR substitutions from liver and testis tissue for coding regions that contained frameshift, nonsense mutations, or deletions, compared to coding regions that did not have these putatively deleterious mutations (functional) (supplementary fig.S5, Supplementary Material online) (all pairwise comparisons P > 0.05; Mann-Whitney U test).Overall, we found no evidence of cis-regulatory divergence associated with deleterious coding sequence divergence, suggesting the adaptive silencing model is unlikely the major driver of cis-regulatory evolution on the threespine stickleback sex chromosomes.

Discussion
The evolution of gene regulation on Y chromosomes has largely been studied in the context of RNA expression levels of gametologs (Muyle et al. 2012(Muyle et al. , 2018;;White et al. 2015;Beaudry et al. 2017;Martin et al. 2019;Veltsos et al. 2019).Over time, gametolog expression is generally lost across most genes on the Y chromosome.With the

MBE
growing number of Y chromosome reference assembles and sequencing methods to quickly profile functional cis-regulatory regions across the genome, it has become feasible to explore the molecular evolution of cisregulatory regions that have led to altered expression patterns.Here, we observed elevated nucleotide substitutions and reduced polymorphisms within cis-regulatory elements across the threespine stickleback Y chromosome, consistent with positive selection driving this process.We found that increased rates of cis-regulatory divergence were associated with X-biased expression of gametologs, suggesting loss of expression from the Y chromosome.
Without liver and testis gene expression from the ninespine stickleback as an ancestral comparison, we were unable to confirm that X-biased gametolog expression is a result of down-regulation of the Y chromosome or upregulation of the X chromosome.However, previous sequencing of a brain transcriptome showed expression patterns consistent with loss of Y expression rather than gain of X expression (White et al. 2015).It is therefore likely that the X-biased expression we observed in liver and testis tissue also occurs through the loss of Y chromosome expression, similar to what has been observed in other species with degenerating Y chromosomes (Muyle et al. 2012;Beaudry et al. 2017;Wei and Bachtrog 2019).Additional work will be necessary to identify the functional variants responsible for the X-biased expression pattern.We currently cannot determine whether the X-biased expression pattern is due to the accumulation of deleterious mutations from inefficient selection, adaptive substitutions that are accumulating in the ACRs, or a combination of the 2. Several theoretical models of sex chromosome evolution have been developed that predict how gene expression evolves over time.Like coding regions, cis-regulatory regions will accumulate deleterious mutations because of reduced efficacy of purifying selection (Charlesworth and Charlesworth 2000;Bachtrog 2008).Without the removal of deleterious mutations, substitution rates within regulatory and coding regions approach the rates observed among nonfunctional intergenic sites.Theory predicts that substitution rates within regulatory regions can be elevated further through positive selection favoring beneficial mutations.The adaptive silencing model (Orr and Kim 1998) posits that Y gametologs that have accumulated maladaptive coding substitutions would be favored to be downregulated through cis-regulatory mutations.Only the functional X-linked gametolog would be expressed in males.An alternative model suggests that deleterious mutations accumulate in cis-regulatory elements throughout the nonrecombining region, initially downregulating all Y-linked gametologs (i.e.degeneration by regulatory evolution (Lenormand et al. 2020;Lenormand and Roze 2022)).As deleterious mutations simultaneously accumulate within coding regions, a positive-feedback loop ensues, leading to a higher substitution rate within cis-regulatory elements to selectively downregulate increasingly maladaptive coding regions.Our survey of ACR evolution in threespine stickleback fish allowed us to compare with each of the previously discussed theories.
Evidence for adaptive silencing (Orr and Kim 1998) would be present if alleles with elevated rates of coding divergence between the X and Y chromosomes were more likely silenced on the Y chromosome.Our results did not support this model.We found that the accumulation of X or Y ACR divergence had no association with relaxed purifying selection in nearby coding regions.Our results complement previous findings that have compared coding sequence evolution to gametolog expression.The level of expression from the Y chromosome is often not correlated with the overall number of deleterious mutations within coding regions (Bachtrog 2006;Bachtrog et al. 2008;Beaudry et al. 2017).While an elevation of d N /d S has been previously used as a proxy for the accumulation of deleterious variants on the Y, this association could also be conflated by positive selection.Here, we showed the lack of an association still held when only looking at genes evolving under relaxed selection.Additionally, we found no correlation between divergence within an ACR and the functional state of a coding region (i.e. containing frameshift or nonsense mutations).We found cis-regulatory evolution was correlated with expression differences, but that ACR divergence accumulated independent of coding divergence.A correlation between nucleotide substitutions in cisregulatory regions and deleterious mutations in coding regions may be observed if ACRs and gene expression are profiled from additional tissues.In addition, we assigned ACRs to coding sequences based on overall proximity to a gene.Although this method of annotating ACRs is commonly used to assign cis-regulatory regions to genes (Connelly et al. 2014;Alexandre et al. 2018;Ricci et al. 2019), it is possible that some ACRs interact with other genes through long-range interactions based on chromatin configuration (Mifsud et al. 2015;Schoenfelder and Fraser 2019), reducing our ability to detect an association between deleterious mutations in coding regions and nucleotide substitutions in ACRs.Also, the number of substitutions within an ACR may not be a fully accurate predictor of the overall ability to silence a given gametolog on the Y chromosome.If only a small number of substitutions are needed to ablate transcription factor binding within an ACR, we would not expect to see a strong correlation between the number of deleterious mutations within coding sequences and the number of substitutions within ACRs.Additional functional work will be necessary to explore these alternatives.
Elevated substitution rates in cis-regulatory regions quickly following recombination suppression would be consistent with the model of degeneration by regulatory evolution (Lenormand et al. 2020;Lenormand and Roze 2022).Silencing of Y-linked alleles occurs first in this model, followed by the accumulation of deleterious mutations within coding regions.We found that putative cis-regulatory elements exhibited signatures of positive selection on the Y chromosome across all 3 evolutionary strata, including the 2 youngest strata that still contain most of the ancestral gene content (Peichel et al. 2020).The degeneration by regulatory evolution model predicts that the substitution rate within cis-regulatory elements Positive Selection Drives cis-regulatory Evolution • https://doi.org/10.1093/molbev/msae020MBE will accelerate once deleterious mutations begin to accumulate within coding regions.This will further suppress gametolog expression from the Y chromosome, masking maladaptive coding mutations.Consistent with this model, our results show cis-regulatory regions have stronger signatures of positive selection compared to autosomal ACRs, as well as both coding and noncoding regions on the sex chromosomes.This suggests there is selection to downregulate Y-linked alleles in liver.Alternatively, substitutions in ACRs could reflect adaptation to upregulate genes on the Y chromosome, to maintain expression, or to gain male-beneficial expression (Skaletsky et al. 2003;Martínez-Pacheco et al. 2020;Shaw and White 2022).Non-adaptive explanations should also be carefully considered as alternatives to adaptive models.For example, elevated substitution rates could be observed in Y-linked ACRs if they have higher mutation rates compared to synonymous and silent intergenic sites.Indeed, it has been suggested that male-biased mutation rates, due to increased cell division (Link et al. 2017), or specialized DNA repair pathways (Chang et al. 2022), could shape the evolution of the Y chromosome.These mutations could be fixed at a faster rate within populations and produce false adaptive signatures because of the smaller effective population size of the Y chromosome.However, the threespine stickleback fish Y chromosome does not show evidence of male-biased mutation rate at synonymous sites (White et al. 2015), and thus it remains unclear how these mutation biases would increase substitution rates uniquely at ACRs.
The degeneration by regulatory evolution model also indicates dosage compensation should evolve simultaneously for genes that are under strong stabilizing selection to maintain expression levels (Lenormand et al. 2020).Degeneration of cisregulatory elements on the Y chromosome and loss of expression should select for up-regulation of the gametolog on the X chromosome to compensate for dosage loss.In this scenario, nucleotide substitutions should accumulate within ACRs on the X chromosome.We found that X-linked ACRs had a depletion of polymorphism, consistent with positive selection.Unlike Y-linked ACRs, the X-linked ACRs had low divergence over longer evolutionary timespans between ninespine and threespine stickleback fish.This suggests that selection has acted on some X-linked sites more recently within threespine stickleback populations to fix beneficial regulatory mutations.This distinction is consistent with Y-linked ACR divergence accumulating first, as predicted by the degeneration by regulatory evolution model (Lenormand et al. 2020).Selection on X-linked cis-regulatory elements within ACRs may still be ongoing.A similar finding was observed on the neo-sex chromosomes of Drosophila miranda, where the neo-X chromosome was enriched for signatures of selective sweeps, while the ancestral X chromosome region was not (Bachtrog et al. 2009).
We found the strongest signatures of positive selection on the X chromosome were from testis ACRs.Testis ACRs may have faster rates of sequence divergence on both the X and Y chromosome, from processes similar to the faster X-effect (Charlesworth et al. 1987;Vicoso and Charlesworth 2009).There is widespread evidence for accelerated rates of coding evolution on X chromosomes compared to autosomes, especially in male-specific tissues (Baines et al. 2008;Meisel and Connallon 2013;Parsch and Ellegren 2013;Larson, et al. 2016;Kopania et al. 2022), as the X-linked mutations in hemizygous regions are always exposed to selection in males.Expression of X-linked gametologs may also evolve at faster rates (Khaitovich et al. 2005;Brawand et al. 2011;Meisel et al. 2012b;Coolon et al. 2015;Kopania et al. 2022), presumably due to mutations in cis-regulatory regions.The elevated substitution rate we observed on the X chromosome in testis ACRs may reflect male-beneficial selection to alter expression levels that are not necessarily connected to dosage compensation.
While we observed molecular signatures of positive selection within ACRs, we did not find sufficient evidence to suggest selection is associated with elevated X expression, as predicted by the degeneration by regulatory evolution model.While selection for dosage compensation could be ongoing, previous findings suggest that most genes on the threespine stickleback Y chromosome can be lost without consequence.Within the oldest stratum, over half of the Y-linked gametologs have been completely lost and males exhibit half expression, relative to females (White et al. 2015).Even across species with chromosome-wide dosage compensation mechanisms, many genes are not compensated completely (Cotton et al. 2013;Tukiainen et al. 2017), which suggests that loss of Y expression is not always deleterious.The dosage sensitivity of each sexlinked gene may be essential for predicting how rapidly cis-regulatory regions evolve on both sex chromosomes.In the degeneration by regulatory evolution model, if stabilizing selection is relaxed, chromosome-wide degeneration in cis-regulatory regions still occurs, the genes just do not evolve dosage compensation (Lenormand et al. 2020).Our results provide empirical support that cisregulatory degeneration can occur without dosage compensation.
The expression balance of dosage-sensitive genes can also be maintained if the Y-linked gametolog does not degenerate.The coding sequence of haploinsufficient genes on Y chromosomes have been shown to be maintained through purifying selection.These genes presumably cannot be lost from the Y chromosome (Bellott et al. 2014(Bellott et al. , 2017;;Peichel et al. 2020;Bellott and Page 2021).For these genes, if Y-linked cis-regulatory elements accumulated deleterious mutations, ancestral expression patterns could be maintained if compensatory transcription binding sites evolved.This would also result in signatures of increased substitution rates within ACRs on the Y chromosome, relative to intergenic regions.This type of compensatory evolution has been proposed for functionally critical genes in mammals (Chaix et al. 2008;Vermunt et al. 2016), andDrosophila (Landry et al. 2005;Arnold et al. 2014;Signor and Nuzhdin 2018) Recent models indicate inversions that suppress recombination between sex chromosomes can be selected to be retained within populations if sex-specific trans-acting regulators evolve to maintain optimal expression in both sexes (i.e.dosage compensation) (Lenormand and Roze 2022).This is based on the idea that if divergent X-and Y-linked cis-regulators were to recombine, it would lead to maladaptive expression levels in males or females.The evolutionary strata on the threespine stickleback Y chromosome evolved from at least 3 nested inversions that suppressed recombination (Peichel et al. 2020).Unlike this recent model, our results suggest these inversions were likely not under strong selection to be retained initially because of early evolution of dosage compensation.Chromosome-wide dosage compensation does not occur in the threespine stickleback (White et al. 2015) and we did not observe a corresponding evolution of X-linked up-regulation.Interestingly, the excess of X-linked mutations observed in testis ACRs may be reflective of a greater abundance of rapidly evolving sexspecific trans-regulators in the testes.However, additional work will be needed to identify the role that autosomal trans-regulators play in modifying gene expression of sex chromosomes in each sex, to test other aspects of the degeneration by regulatory evolution model.
Although we focused on the degeneration of cis-regulatory elements on the Y chromosome, it is important to note that Y chromosomes become masculinized over time and cis-regulatory elements may also be under positive selection for male-specific neo-or subfunctionalization (Soh et al. 2014;Mahajan and Bachtrog 2017;Bachtrog et al. 2019;Martínez-Pacheco et al. 2020;Peichel et al. 2020;Chang et al. 2022).Some of the ACRs we found with high divergence may actually reflect functional cis-regulatory regions associated with testis-specific genes.Interestingly, we found signatures of positive selection in ACRs near testis-expressed genes.Some of these genes were enriched for Y-biased expression.These findings provide evidence that ancestral gametolog expression levels can also change presumably through gain of new testisrelated functions.Additional work will be necessary to identify the novel function these regulatory regions provide during spermatogenesis.An increase in testis expression was also found for ancestral gametologs in a comparative study across 17 species of mammals (Martínez-Pacheco et al. 2020).In these species, most Y-linked gametologs gained novel testis expression patterns, compared to their X-linked alleles.Rapid sex-linked cis-regulatory evolution may therefore be a universal phenomenon across species.
Y chromosomes often have much lower diversity overall, relative to neutral expectations (Lawson Handley et al. 2006;Wilson Sayres et al. 2014).This pattern has been attributed to purifying selection removing deleterious mutations and the linked neutral variation throughout the nonrecombining region (Wilson Sayres et al. 2014;Wilson Sayres 2018).Our findings revealed that nucleotide diversity on the threespine stickleback Y chromosome is also much lower than neutral expectations.Our results suggest positive selection within ACRs could also be an important driver of reducing nucleotide diversity on the Y chromosome within populations.One important consideration is whether sex-biased demography may affect Y-linked diversity.If males are more variable in reproductive success than females, this could lower the expected effective population size of the Y chromosome, reducing nucleotide diversity.Little is known about the operating sex-ratios of stickleback fish.Some populations exhibit sex ratios that are femalebiased (Rollins et al. 2017).This could contribute to the low nucleotide diversity we observed.However, simulations in human populations revealed that even drastic shifts in sex ratio could not entirely explain the low diversity observed on the Y chromosome (Wilson Sayres et al. 2014).

Conclusion
Together, our results provide evidence of positive selection driving accelerated rates of nucleotide substitution in cis-regulatory elements.Signatures of positive selection, even in the youngest 2 strata, indicate that cis-regulatory evolution can proceed rapidly following the suppression of recombination, leading to reduced gene expression from the Y chromosome.These results support some aspects of recent models of sex chromosome evolution, where cis-regulatory degeneration and silencing occur first on the Y chromosome.However, we found this can occur in the absence of dosage compensation, contrasting other assumptions of these models.Improvements in functional annotations of regulatory regions as well as an evergrowing collection of high-quality Y and W assemblies will allow continued empirical testing of new regulatory models of sex chromosome evolution.
We used a Tn5 control sample to normalize ATAC-seq reads to remove the effect of Tn5 bias.For the control, genomic DNA was extracted from a caudal fin clip of 1 male fish using a standard phenol-chloroform extraction.The ATAC-seq library preparation, whole-genome sequencing library preparation, and sequencing (Illumina HiSeq 2 × 150bp) of the control sample were completed by GENEWIZ (New Jersey, USA).Whole-genome sequencing was conducted in order to show the DNA sample was of sufficient quality to construct the Tn5 bias control.For the wholegenome sequencing, we recovered over 232 million reads with an average quality score of 38.59, indicating a highquality sample.We trimmed residual adapters and lowquality sequences using Trimmomatic as previously described.Trimmed reads were aligned to the threespine stickleback genome using Bowtie2.The read coverage per base pair was calculated using BEDTools (v2.29, -d) (Quinlan and Hall 2010).The whole-genome sequencing sample had an average read depth of 90 × where only ∼10% of the genome was supported with <10 reads.Within these regions, only 6% had zero reads per base pair, indicating 94% of the genome could be queried for biased integration of Tn5.The Tn5 bias control produced over 177 million reads.The reads were trimmed from these reads using Trimmomatic with the same parameters.The trimmed reads were aligned to the threespine stickleback genome using Bowtie2 with default parameters.In order to use previous alignment coordinates between the X and Y chromosomes, we used the v.4 genome build of the X chromosome (Peichel et al. 2020).We used the v.5 genome build of chromosome 18 (Nath et al. 2021).Reads mapping to the mitochondria and unscaffolded regions were removed.PCR duplicates were removed using MarkDuplicates from Picard.
We evaluated concordance between replicates in several ways.We first searched for enrichment of ATAC-seq reads around the transcription start sites (TSSs) of expressed genes.Iso-Seq long-read sequencing was previously conducted in threespine stickleback fish to curate accurate transcription start site annotations across multiple tissues (Naftaly et al. 2021).We used deepTools (v3.5.1) computeMatrix reference-point (Ramírez et al. 2016) on the ATAC-seq alignments to assay read depth 3 kb around the complete set of annotated TSSs.We plotted enrichment of ATAC-seq reads around TSSs and compared with RNA-seq expression using deepTools plotHeatmap with default settings, except that we used the -sortUsingsamples to sort the regions by expression.We found strong enrichment around TSSs, across both replicates for the same set of expressed genes (supplementary fig.S6, Supplementary Material online).ACRs were called for each replicate using MACS2 (v2.2.7.1) callpeak with the -keep-dup all parameter, and read depth was normalized with Tn5 control sequencing with the -c parameter (Zhang et al. 2008).The majority of ACRs were shared across replicates (liver: 67% and testis: 61%) We then used the P-value of each MACS2 peak to calculate the Irreproducibility Discovery Rate (IDR) (https://github.com/daniel-shaw1/Regulatory_divergence_paper/blob/main/ Supplemental/IDR.r),(supplementary fig.S7, Supplementary Material online).The IDR determines the probability that a peak is reproducible between replicates, due to chance (Qunhua et al. 2011;Landt et al. 2012).Replicates that have IDR < 0.05 are generally considered to be highly reproducible.However, the IDR test is considered very stringent (Jalili et al. 2015), potentially removing many true regions of increased accessibility.Because both replicates exhibited high concordance, we pooled the aligned reads from both replicates and called peaks from the joint alignment using MACS2 as previously described, as pooling raw reads has been previously reported (Reske et al. 2020;Yan et al. 2020) as a strategy to find high-quality enriched regions.We found that most ACRs (liver: 79%, testis: 75%) from the pooled sample overlapped with low-scoring ACRs from the individual replicates (IDR < 0.05).This suggested a majority of ACRs in the pooled dataset were of high-quality in the individual replicates.
To define an ancestral set of ACRs, we mapped the nucleotide sequence of each noncoding ACR on the X chromosome to the homologous region on the Y chromosome using previously generated alignment coordinates (Peichel et al. 2020).A similar approach has been used previously to define the ancestral set of coding regions, aligning X-linked gene annotations to the Y chromosome (Peichel et al. 2020).We also attempted to identify Y-linked ACRs and calculate divergence in a similar method as we identified with X-linked ACRs.However, we found that MACS2 identified significantly fewer enriched regions of accessible chromatin on the Y chromosome compared to the X chromosome.The few ACRs that were identified also aligned to multiple regions on the X chromosome, suggesting these regions were repetitive.Using the X chromosome ACRs, we identified orthologous regions between the X chromosome, Y chromosome, and orthologous autosome (chromosome 19) from the ninespine stickleback fish (Pungitius pungitius; (Varadharajan et al. 2019) using BLAST+ (blastn v. 2.11.0) with default blastn parameters and -perc_identity set to 75.(Camacho et al. 2009).To ensure high-quality alignments, we filtered for uniquely mapping alignments that also had a bit-score >100.We used BLAST+ alignments to extract ACRs and the proximal 50 bp upstream and downstream, to create multiple sequence alignments for downstream analysis.We aligned the threespine stickleback X and Y ACR sequences to the Shaw et al. • https://doi.org/10.1093/molbev/msae020MBE ninespine stickleback autosome sequence using MUSCLE (v 3.8.1551)(Edgar 2004) with default parameters.We used MUSCLE to create a multiple sequence alignment between the X sequence, Y sequence, and the sequence from the orthologous autosome in order to identify X-and Y-specific variants.We called single nucleotide variants using snp-sites (v2.5.1) (Page et al. 2016) with default parameters.We calculated the reported divergence rate by dividing the number of variants by the number of the aligned sites.
We compared ACR divergence on the sex chromosomes to ACR divergence on autosome 18, a chromosome similar in length to the X chromosome.We identified orthologous regions between the threespine stickleback autosome 18 and autosome 18 from the ninespine stickleback fish using BLAST+ (blastn v. 2.11.0) (Camacho et al. 2009) as previously described.We called variants between autosomal regions in a similar manner as the sex chromosomes, by creating a pairwise alignment using MUSCLE (v 3.8.1551)with default parameters and called single nucleotide variants using the same python script as above.We calculated a divergence rate by dividing the number of single nucleotide variants by the size of the aligned region.
Divergence within ACRs was compared with the divergence of coding regions of neighboring genes.We identified the closest gene to each ACR by running annotatepeaks.pl in homer (v 4.11) (Heinz et al. 2010) using default settings.Per default settings, ACRs were assigned to the closest gene within 50 kb of the TSS.We verified the homer peak annotations with a custom Python script that identifies ACRs near the TSS of genes (https://github.com/daniel-shaw1/Regulatory_divergence_paper/blob/main/ Supplemental/Peaksneargenes.py).Both methods were 100% concordant with each other.We further verified a subset of the annotations (200 peaks total) by visualizing the ACRs in the threespine stickleback JBrowse genome browser (https://stickleback.genetics.uga.edu/).We found that 100% of the 200 X-linked ACRs were located proximal to the predicted annotated gene (supplementary fig.S8, Supplementary Material online).For coding divergence comparisons, we used previously reported estimates of synonymous (d S ) and nonsynonymous (d N ) divergence of gametologs between the X and Y chromosomes (Peichel et al. 2020).Coding regions with d N /d S ratios of 99 were omitted as these represent alignments with very little sequence divergence to estimate d N /d S accurately.

Estimating a Control Substitution Rate
To generate control regions, we randomly sampled nonfunctional intergenic regions throughout the X chromosome, Y chromosome, or chromosome 18 to estimate a neutral substitution rate.Regulatory regions tend to be GC-rich, which can be prone to higher mutation rates through mechanisms like GC-biased gene conversion or spontaneous deamination of methylated cytosines (Nesta et al. 2021).We therefore GC-matched the randomly drawn intergenic regions with the GC content of ACRs.We calculated the GC percentage of ACRs using a modified Perl script (countbp.pl,Nicholas Navin, http://www.navinlab.com/bioperl/bioperl/gc_content.html).For each ACR, a random intergenic region was drawn from the X chromosome equal in length and with a GC content within 2.0%.Intergenic regions were defined as any region that fell outside of annotated functional regions from a combination of Ensembl annotations (Cunningham et al. 2022), Isoseq transcripts (Naftaly et al. 2021), Y chromosome annotations (Peichel et al. 2020), repetitive elements (Peichel et al. 2020;Nath et al. 2021) and ACRs from this study.This was performed for each stratum individually using bedtools shuffle -I ACRs.bed -incl XYalignment.bed-excl annotations.bed.We extracted 10,000 sets of GC-matched intergenic regions from each of the 3 evolutionary strata on the threespine stickleback sex chromosomes (set size same as number of Liver ACRs: stratum 1: 442, stratum 2: 228, stratum 3: 199; Testis ACRs: stratum 1: 30, stratum 2: 37, stratum 3: 51).Due to a limited number of intergenic sites shared between the sex chromosomes, we sampled with replacement to align a sufficient number of intergenic regions.We identified the orthologous regions to the ninespine genome assembly using BLAST+ (blastn v2.11.0) (Camacho et al. 2009).We filtered for alignments that mapped uniquely to chromosome 19 in the ninespine stickleback fish, had an alignment length equal to 75% of the threespine query sequence, and had a bit-score >100.We generated multiple sequence alignments using MUSCLE (v 3.8.1551),for each intergenic permutation and calculated X-and Y-specific substitution rate as previously described.

Estimating Within Population Nucleotide Diversity
We used whole-genome short-read sequencing of 12 males from the Lake Washington population (Washington, USA; NCBI SRA SRP137809) to estimate nucleotide diversity across the sex chromosomes.The raw reads were trimmed with Trimmomatic (v.0.39) (Bolger et al. 2014), using a sliding window of 4 bases, trimming the remainder of the read when the average quality within a window dropped <15.The leading and trailing base pairs below quality 3 of every read were removed along with any residual adapter sequence.After trimming, any reads below a minimum length of 36 were discarded.We aligned the trimmed reads using Bowtie2 (v.2.4.5) with default parameters (Langmead and Salzberg 2012).We also marked PCR duplicates by using the MarkDuplicates function in Picard (v.2.26.10) (https://github.com/broadinstitute/picard).Single nucleotide polymorphism (SNP) genotyping was conducted using the GATK software package (v.4.2.5.0).We called variants using HaplotypeCaller in genomic variant call format (GVCF) mode.We then joint-called variants using GenotypeGVCFs using a genomics database created by GenomicsDBImport.
We only considered biallelic SNPs in our estimates of nucleotide diversity and we filtered for high-quality genotypes using several different methods.On the sex chromosomes, sites should only have hemizygous genotypes in males in the noncrossover region outside of the pseudoautosomal region.Any sites that are heterozygous would be caused by errors in read alignment.We therefore did not consider Positive Selection Drives cis-regulatory Evolution • https://doi.org/10.1093/molbev/msae020MBE sites that were heterozygous in the noncrossover region on the X or Y chromosomes.We also filtered out sites that exhibited too low or high of read depth, which would be indicative of alignment errors.To do this, we did not consider sites that were less than one-half or greater than double the median read depth of each chromosome (X and Y chromosomes: positions were retained if the read depth was between 3.5 and 14; chromosome 18: positions were retained if the read depth was between 6.5 and 26).Read filtering was conducted using custom Perl scripts.

Testing for Positive Selection Using Population Polymorphism
We performed a modified version of the MK test (Andolfatto 2005) by treating divergence (D) and diversity (P) within ACRs as the nonsynonymous factor, and nonfunctional intergenic sequence previously generated was used as synonymous factor.As intergenic sequence is independent of each ACR, we pooled sites from each factor as previously described (Andolfatto 2005).We counted the total number of polymorphic sites or fixed substitutions (divergence) for each factor to calculate α, the fraction of substitutions that are likely fixed due to positive selection.Divergence was estimated either between the threespine stickleback X and Y chromosomes or between the threespine and ninespine stickleback autosome 18. Negative alpha values were interpreted as regions that have rates of adaptive evolution near zero (Good et al. 2013).We also used these counts to estimate the NI (Rand and Kann 1996).An NI >1 indicates an excess of ACR divergence within species, while an NI <one indicates an excess of ACR divergence between species (or between the X and Y).An index <1 would be consistent with positive selection.To test for significant departures from neutrality within each stratum or chromosome we performed a chi-square test to compare the rates of d N /d S to p N /p S .Standard deviation was determined through non-parametric bootstrapping.We bootstrapped 10,000 replicates for each factor within the MK or NI test and computed both metrics for each replicate.
We also performed an MK test on coding regions.We aligned the coding sequence of all threespine stickleback X chromosome Ensembl predicted genes (build 95) to the Y chromosome and the ninespine stickleback outgroup (autosome 19; v. 7) (Varadharajan et al. 2019) using Exonerate (v.2.4.0)(Slater and Birney 2005) with parameters: -model coding2genome -M 2000 -bestn 1 -annotation.The Ensembl coding regions from the threespine stickleback autosome 18 were also aligned to the ninespine stickleback autosome 18.The highest-scoring alignment from Exonerate was used for the MK test.Divergence and polymorphism were tallied at each position of the alignment.If a position was variable, we determined if this was a synonymous or nonsynonymous mutation in the threespine stickleback by comparing it to the ninespine stickleback translated codon.Sites were omitted if they contained any N characters or contained any missing genotype data among the 12 males within the variant call format (VCF) file.Codons were omitted if they contained any gaps from the alignment.We only considered biallelic sites when counting polymorphisms and fixed substitutions within the Lake Washington population.We used custom Perl scripts to count the total number of divergent and polymorphic sites.To compare to ACRs, we calculated α and NI by pooling sites across all coding regions.For gene-by-gene MK tests, we log-transformed NI to include genes with a count of zero for any particular category of divergence or polymorphism and refer to the value as NIHaldane (Haldane 1956;Stoletzki and Eyre-Walker 2011).

Comparison With Allele-specific Expression Patterns
We measured allele-specific expression of transcripts on the X and Y chromosome using 3 replicate liver samples (NCBI BioProject PRJNA591630) and 3 replicate testis samples (NCBI BioProject PRJNA591630).We trimmed the RNA-seq reads with Trimmomatic (v.0.36) (Bolger et al. 2014) using a sliding window of 4 bases, trimming the remainder of the read when the average quality within a window dropped below 20 (SLIDINGWINDOW:4:20). Residual sequencing adapters were also removed using ILLUMINACLIP.We aligned RNA-seq reads using Tophat2 (v2.2.1) with default parameters (Kim et al. 2013).We filtered for alignments with a map quality greater than 25 using SAMtools (v1.14) (Li et al. 2009) to identify reads that map uniquely on the X-and Y-alleles.Similar map quality filters have been used to distinguish short reads between sex chromosomes in Drosophila and stickleback fish (Ellison and Bachtrog 2019;Peichel et al. 2020) and between sexes with heteromorphic sex chromosomes in guppies (Kirkpatrick et al. 2022).Read counts were obtained using htseq-count (v 0.9.1) (Anders et al. 2015).Default parameters were used with the addition of -stranded = no and -nonunique all to maximize our ability to count reads on the Y chromosome.We used a custom gene transfer format (GTF) file for htseq-count that included previously determined start and end sites of all Ensembl predicted transcripts that are shared between the X and Y chromosomes (Peichel et al. 2020).Reads were counted on the X and Y across the entire transcript as a feature.Transcripts were removed from the analysis if they had an RNA expression count of 0 in all samples.For each tissue, we calculated the average read count for each gametolog across the 3 tissue replicates.We calculated the allele-specific expression as log 2 -(X transcript average/Y transcript average).
Institutes of Health R01GM147312 (M.A.W.) and T32GM007103 (A.S.N.), and by the University of Georgia Research Foundation (D.E.S.).We would like to thank Bob Schmitz and Casey Bergman for helpful discussions related to the project.

MBEFig. 2 .
Fig. 2. Accessible chromatin divergence between threespine and ninespine stickleback.a) Predicted substitution rates within accessible regions.ACR substitution rates (red) lower than random nonfunctional control regions (white) and synonymous substitutions (blue) would indicate purifying selection (left).ACR substitution rates roughly equal to nonfunctional control regions would indicate mutations are accumulating due to inefficient selection from the loss of recombination (middle).ACR substitution rates higher than nonfunctional control regions would indicate positive selection.b) Y-linked ACR divergence is elevated over nonfunctional intergenic regions.X and Y ACR divergence was compared to randomly drawn nonfunctional regions matched for GC content (10,000 permutations).The average ACR divergence of each stratum is shown by the dotted line (stratum 1: pink, stratum 2: green, and stratum 3: blue).ACRs were identified on the X chromosome and mapped to the Y chromosome and homologous ninespine stickleback autosome 19 from liver (top) and testis (bottom).Autosome divergence is from chromosome 18.For the sex chromosomes, we identified variants that were unique to the X and Y, by aligning to the outgroup and ignoring fixed differences between species that likely occurred before the sex chromosomes evolved.X-and Y-specific variants within ACRs were summed and divided by the total length of the ACR to calculate divergence.Stratum 1: 442 liver ACRs and 30 testis ACRs; stratum 2: 228 liver ACRs and 37 testis ACRs; stratum 3: 129 liver ACRs and 46 testis ACRs; autosomal: 1056 liver ACRs and 305 testis ACRs.Graphic made with BioRender.com.
Fig. 2. Accessible chromatin divergence between threespine and ninespine stickleback.a) Predicted substitution rates within accessible regions.ACR substitution rates (red) lower than random nonfunctional control regions (white) and synonymous substitutions (blue) would indicate purifying selection (left).ACR substitution rates roughly equal to nonfunctional control regions would indicate mutations are accumulating due to inefficient selection from the loss of recombination (middle).ACR substitution rates higher than nonfunctional control regions would indicate positive selection.b) Y-linked ACR divergence is elevated over nonfunctional intergenic regions.X and Y ACR divergence was compared to randomly drawn nonfunctional regions matched for GC content (10,000 permutations).The average ACR divergence of each stratum is shown by the dotted line (stratum 1: pink, stratum 2: green, and stratum 3: blue).ACRs were identified on the X chromosome and mapped to the Y chromosome and homologous ninespine stickleback autosome 19 from liver (top) and testis (bottom).Autosome divergence is from chromosome 18.For the sex chromosomes, we identified variants that were unique to the X and Y, by aligning to the outgroup and ignoring fixed differences between species that likely occurred before the sex chromosomes evolved.X-and Y-specific variants within ACRs were summed and divided by the total length of the ACR to calculate divergence.Stratum 1: 442 liver ACRs and 30 testis ACRs; stratum 2: 228 liver ACRs and 37 testis ACRs; stratum 3: 129 liver ACRs and 46 testis ACRs; autosomal: 1056 liver ACRs and 305 testis ACRs.Graphic made with BioRender.com.

Fig. 3 .
Fig. 3. Quantifying adaptive divergence and departure from neutrality among ACRs and coding regions.Sex-linked ACRs exhibited signatures of positive selection in both liver (left) and testis (middle).Sex-linked coding regions (right) are mostly under purifying selection or not significantly different than neutral expectations.The NI was calculated using the odds ratio of the McDonald Kreitman test.NI > 1 reflects purifying selection while NI < 1 is a signature of positive selection.Asterisks denote significant departures from neutrality determined by a chi-square test.Error bars represent 2 standard deviations calculated from 10,000 bootstrap replicates.

Fig. 4 .
Fig. 4. Gametolog expression level compared to accessible chromatin region divergence.ACR divergence on the X and Y chromosomes is correlated with X-biased gene expression in both liver and testis tissues.RNA-Seq transcript counts that uniquely mapped to the X and Y chromosome were quantified to determine allele-specific expression for all gametologs.Gametolog expression was compared to the average divergence of Y-specific (a and b) and X-specific (c and d) mutations for ACRs within 50 kb of expressed genes.a) Y liver: R = 0.40, P = 0.004, N = 51; b) Y Testis: R = 0.201, P = 0.26, N = 31; c) X Liver: R = 0.30, P = 0.033, N = 51; d) X Testis: R = 0.07, P = 0.94, N = 31.Shaded regions represent 95% confidence intervals based on bootstrapping.The regressions were not significantly different between the X and Y chromosome for liver (P = 0.590, Fisher transformation) or testis (P = 0.077, Fisher transformation).

d
N /d S (supplementary fig.S2, Supplementary Material online; each stratum on X and Y: P > 0.05; Spearman's rank correlation), indicating coding regions with elevated ACR divergence are not more likely to have elevated substitutions in the amino acid sequence.Elevation of d N /d S on the Y chromosome could indicate relaxed purifying selection or positive selection.Combining genes that are evolving under both could mask signatures of adaptive silencing.To account for this, we split gametologs on the Y chromosome into classes based on the signatures of selection from the MK test.We grouped gametologs into categories of purifying selection (NIHaldane > 0.25; N = 262), positive selection (NIHaldane < −0.25; N = 41), and genes evolving through relaxed purifying selection (−0.25 >NIHaldane > 0.25; N = 219).If selection was acting on ACRs to downregulate genes with deleterious mutations, we would expect to see a positive correlation between ACR divergence and d N /d S for genes with signatures of relaxed purifying selection.In contrast to this prediction, we found a weak negative correlation between ACR divergence and d N /d S for expressed genes under relaxed purifying selection (supplementary fig.S3, Supplementary Material online; R = −0.293;P = 0.0155; N = 35; Spearman's rank correlation).These genes also did not have more X-biased expression than other genes (supplementary fig.S4, Supplementary Material online; P > 0.05; Kruskal-Wallis test).Interestingly, we did find a positive correlation between ACR divergence and d N /d S for genes evolving under positive selection (supplementary fig.S4, Supplementary Material online; R = 0.92; P = 0.002; N = 10; Spearman's rank correlation).

Table 2
MK test for coding regions

Table 1
MK test for noncoding ACRs polymorphism (Table2).A gene-by-gene MK test revealed that 50% of X-linked gametologs had an NI > 1, compared to only 25% which had an excess of amino acid substitutions (supplementary material, Supplementary Material online).Importantly, sex-linked coding regions were not under widespread positive selection, like ACRs.some is predicted to contain 75% of the nucleotide diversity compared to the autosomes.We found that the nonpseudoatusomal region (non-PAR) regions of the X chromosome had 45% of the amount of nucleotide diversity as autosomes, indicating the X chromosome may also be under selection, reducing nucleotide diversity beyond neutral expectations (non-PAR X chromosome pi: 0.0019; autosome pi: 0.0043; supplementary fig.S1, Supplementary Material online).We found that the Y chromosome had consistent signatures of low nucleotide diversity across all strata (supplementary fig.S1, Supplementary Material online).However, there were distinct differences in nucleotide diversity for each stratum and the PAR on the X chromosome (supplementary fig.S1, Supplementary Material online; P < 0.001 for all comparisons; Kruskal-Wallis and Dunn's test), suggesting each region of the X chromosome may be under different selection pressures depending on gene and regulatory content.

Table 3
Gametologs with Y-biased expression in testes